Speech recognition error analysis on the English MALACH corpus
Abstract
This paper presents an analysis of the word recognition error rate on an English subset of the MALACH corpus. The MALACH project is an NSF-funded research program on multilingual access to large audio archives. The archive of interest is a large collection of testimonies from 52,000 survivors, liberators, rescuers and witnesses of the Nazi Holocaust, assembled by the Shoah Visual History Foundation. The data have characteristics that are unusual in the speech recognition community, such as elderly speech, noisy conditions, and heavily accented speech, which make them a challenging task for automatic speech recognition (ASR). This paper attempts to identify the factors affecting ASR performance on that task. The signal-to-noise ratio and the syllable rate were found to be the two dominant factors in explaining the overall word error rate, while we observed no evidence that accent or speaker's age affected recognition performance. Based on this evidence, noise compensation experiments were carried out and led to a 1.1% absolute reduction of the word error rate.
Related resources
Towards automatic transcription of large spoken archives - English ASR for the MALACH project
Digital archives have emerged as the pre-eminent method for capturing the human experience. Before such archives can be used efficiently, their contents must be described. The NSF-funded MALACH project aims to provide improved access to large spoken archives by advancing the state-of-the-art in automated speech recognition (ASR), Information Retrieval (IR) and related technologies [1, 2] for mu...
Error Analysis of Taiwanese University Students’ English Essay Writing: A Longitudinal Corpus Study
Writing is considered one of the most difficult skills in EFL/ESL. Thus, meticulous recognition and classification of students’ errors in certain contexts is a worthwhile endeavor which provides us with both diagnostic and prognostic power. Accordingly, a total of 430 students in 15 English writing classes held during 12 consecutive semesters in a private university in central Taiwan were the s...
Towards Automatic Transcription of Large Spoken Archives in Agglutinating Languages - Hungarian ASR for the MALACH Project
The paper describes automatic speech recognition experiments and results on the spontaneous Hungarian MALACH speech corpus. A novel morph-based lexical modeling approach is compared to the traditional word-based one and to another, previously best-performing morph-based one in terms of word and letter error rates. The applied language and acoustic modeling techniques are also detailed. Using uns...
Information Access in Large Spoken Archives
Digital archives have emerged as the pre-eminent method for capturing the human experience. Before such archives can be used efficiently, their contents must be described. The scale of such archives along with the associated content mark up cost make it impractical to provide access via purely manual means, but automatic technologies for search in spoken materials still have relatively limited ...
Use of metadata to improve recognition of spontaneous speech and named entities
With improved recognition accuracies for LVCSR tasks, it has become possible to search large collections of spontaneous speech for a variety of information. The MALACH corpus of Holocaust testimonials is one such collection, in which we are interested in automatically transcribing and retrieving portions that are relevant to named entities such as people, places, and organizations. Since the te...
Publication date: 2004